Abstract:The next-generation Internet of Things (IoT) is evolving toward a ubiquitous, ultra-low-power, and multi-band heterogeneous networking paradigm that seamlessly integrates terrestrial, non-terrestrial, and ambient devices. This vision places unprecedented demands on conventional radio frequency (RF) receivers, whose fundamental bottlenecks in sensitivity, power consumption, coverage, and multi-band operation are rooted in the RF antenna. To tackle these issues, we show that the quantum properties of Rydberg atomic quantum receivers (RAQRs), including ultra-high sensitivity, broad frequency agility, and diverse reception modalities, provide a physically distinct receiver-side path that replaces the conventional antenna-and-low-noise-amplifier chain. Using LoRa, narrowband IoT, and ambient IoT as case studies, this article shows that RAQRs deliver significant gains in weak-uplink, low-power, and battery-free regimes. A stochastic-geometry analysis in cellular and cell-free architectures then maps these device-level gains onto network coverage, where the RAQR retains roughly a 4 dB half-coverage advantage over the RF receiver in sparse deployments at \(\lambda \sim 10^{-5}~\mathrm{m}^{-2}\), with the gain eroding as device density grows. Finally, we outline the open challenges that stand between current RAQR prototypes and deployable IoT infrastructure.
Abstract:Wireless digital twins require repeated synchronization between a time-evolving physical scene and its digital counterpart under limited and time-varying communication resources. For perception-centric twins, pixel-domain transmission or uniformly protected bitstreams can be mismatched to the semantic state consumed by twin-side applications. This paper proposes TWIST, a closed-loop token synchronization framework for application-aware wireless digital twins. TWIST represents each physical observation as a token and synchronizes this state over a wireless link, rather than optimizing visual reconstruction. Token positions are grouped by task relevance and protected through mode-conditioned unequal error protection under low-, medium-, and high-synchronization modes. At the twin side, decoding confidence converts unreliable hard token decisions into erasures, which are restored by a completion model before updating the semantic twin state. The recovered state supports traffic-state inference and generates compact feedback statistics, including channel quality, receiver uncertainty, semantic drift, and application priority, for subsequent mode adaptation. Experiments on a dynamic road-scene digital-twin scenario show that TWIST improves traffic-state inference and semantic twin-state synchronization compared with fixed-mode and channel-only adaptation strategies, while reducing the average synchronization cost relative to always-high transmission.
Abstract:Tokens are becoming the basic units through which foundation models represent and process information for understanding and inference. However, traditional wireless communication, centered on bit-level fidelity, faces a mismatch between what is transmitted reliably and what downstream models actually consume. This mismatch calls for a communication design that directly accounts for token-level task relevance and downstream model requirements, rather than treating all transmitted bits as equally important. In this paper, we propose TONIC, a token-centric semantic communication framework for task-oriented wireless systems. The transmitter converts each source sample into a sequence of tokens, estimates token-level task relevance, and allocates protection through utility-aware unequal error protection under a fixed channel-use budget. At the receiver, token-level confidence is used to gate unreliable decisions, turning harmful substitutions into recoverable erasures before a Transformer-based completion model restores the masked tokens for final task inference. Our framework combines transmitter-side semantic-aware protection with receiver-side confidence-aware gating in a modular and interpretable architecture, rather than relying solely on fully black-box end-to-end learning. We further establish a utility-aware Bayes-risk interpretation for the receiver-side gating rule and study its interaction with unequal protection and completion. Experimental results on image classification show that TONIC consistently outperforms separation-based schemes, the pixel-domain DeepJSCC baseline, and token-domain baselines under matched communication budgets over AWGN, Rayleigh, and Rician channels.
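The receiver-side gating step described above can be illustrated with a minimal sketch. It assumes a generic two-action Bayes decision (keep the hard decision or erase it); the names `should_erase`, `gate_tokens`, the cost values, and the example tokens are hypothetical and are not taken from TONIC itself.

```python
from typing import List, Optional, Sequence


def should_erase(p_correct: float, substitution_cost: float, erasure_cost: float) -> bool:
    """Erase when the expected cost of a wrong kept token exceeds the erasure cost.

    Assumed rule: keeping costs (1 - p_correct) * substitution_cost on average,
    erasing always costs erasure_cost.
    """
    return (1.0 - p_correct) * substitution_cost > erasure_cost


def gate_tokens(
    tokens: Sequence[int],
    confidences: Sequence[float],
    substitution_costs: Sequence[float],
    erasure_cost: float = 0.3,
) -> List[Optional[int]]:
    """Return the token sequence with unreliable decisions replaced by None (erasure)."""
    if not (len(tokens) == len(confidences) == len(substitution_costs)):
        raise ValueError("tokens, confidences and substitution_costs must align")
    return [
        None if should_erase(p, cost, erasure_cost) else tok
        for tok, p, cost in zip(tokens, confidences, substitution_costs)
    ]


# Hypothetical decoder output: hard token decisions and their confidences.
decoded = [17, 4, 92, 4, 55]
confidence = [0.95, 0.40, 0.88, 0.59, 0.80]
# The last position is treated as more task-relevant, so a wrong value costs more.
costs = [1.0, 1.0, 1.0, 1.0, 2.0]

gated = gate_tokens(decoded, confidence, costs)
# Positions a completion model would have to restore.
masked_positions = [i for i, tok in enumerate(gated) if tok is None]
```

In this toy rule, a higher substitution cost makes a token easier to erase, which is one simple way a utility weight could interact with the confidence threshold.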
Abstract:The Six-Dimensional Movable Antenna (6DMA) system has emerged as a promising technology to enhance wireless capacity by fully exploiting spatial degrees of freedom. However, applying 6DMA to high-mobility Internet of Vehicles (IoV) scenarios faces significant challenges, primarily due to the difficulty of acquiring instantaneous Channel State Information (CSI) and the risk of service interruptions caused by mechanical reconfiguration delays. To address these issues, this paper proposes a low-complexity, CSI-free single-step reconfiguration framework. First, we design a deterministic discrete position generation scheme based on a latitude-longitude grid with inherent topological structures. Leveraging graph theory, we explicitly model and theoretically derive the lower bounds of movement and time costs for antenna reconfiguration. Subsequently, utilizing the directional sparsity of 6DMA channels, we develop an adaptive optimization strategy that fuses offline environmental priors with online historical feedback. Furthermore, a periodic reconfiguration mechanism based on predicted cumulative vehicle distributions is introduced. By strictly restricting antenna adjustments to the first-order spatial neighborhood, the proposed single-step method effectively eliminates service interruptions. Simulation results demonstrate that the proposed scheme significantly outperforms traditional fixed and global-search-based benchmarks in terms of uplink sum rate, while incurring negligible mechanical overhead and latency, thereby validating its feasibility and robustness in highly dynamic vehicular networks.
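As a rough illustration of the first-order neighborhood and the graph view of movement cost, the sketch below models candidate positions as nodes of a latitude-longitude grid. The topology (longitude wraps around, latitude does not), the function names, and the use of hop count as a stand-in for movement cost are all assumptions, not the paper's construction.

```python
from collections import deque
from typing import Dict, List, Tuple

Node = Tuple[int, int]  # (latitude index, longitude index)


def neighbors(node: Node, n_lat: int, n_lon: int) -> List[Node]:
    """First-order neighbors on an assumed grid: longitude wraps, latitude is bounded."""
    i, j = node
    out = [(i, (j - 1) % n_lon), (i, (j + 1) % n_lon)]
    if i > 0:
        out.append((i - 1, j))
    if i < n_lat - 1:
        out.append((i + 1, j))
    return out


def hop_distance(src: Node, dst: Node, n_lat: int, n_lon: int) -> int:
    """Minimum number of single-step moves from src to dst (breadth-first search)."""
    dist: Dict[Node, int] = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for nb in neighbors(node, n_lat, n_lon):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    raise ValueError("destination is not reachable on this grid")


# Hypothetical grid with 4 latitude rings and 8 longitude slots.
one_step_options = neighbors((0, 0), 4, 8)
wrap_hops = hop_distance((0, 0), (0, 7), 4, 8)
far_hops = hop_distance((0, 0), (3, 4), 4, 8)
```

Under these assumptions the hop count is a lower bound on the number of reconfiguration steps between two positions when each step is restricted to a first-order neighbor.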
Abstract:This paper studies end-to-end latency minimization for a multi-band radar sensing and deep neural network (DNN) inference pipeline. Unlike conventional stage-wise designs that treat radar sensing and DNN inference as two sequential stages, the proposed framework exploits cross-stage parallelism by allowing the inference branch associated with a sensed band to start as soon as that band completes sensing, without waiting for all bands to finish. To characterize this interaction, we formulate a joint scheduling problem that couples sensing-time allocation, branch release timing, and non-preemptive multi-core execution of a directed acyclic graph (DAG) under sensing-feasibility, precedence, and core-capacity constraints. Since the resulting problem is combinatorial and strongly time-coupled, we further develop a release-aware heuristic that evaluates each sensing decision according to its downstream impact on the DAG makespan, together with a greedy list scheduler for multi-core DAG execution under release times. Simulation results show that the proposed design can effectively exploit cross-stage parallelism and reduce end-to-end latency relative to a decoupled baseline in many heterogeneous sensing scenarios, while also clarifying the operating regimes in which the latency gain becomes limited.
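A greedy list scheduler for non-preemptive DAG execution under release times can be sketched as follows. This is a generic earliest-ready-first heuristic, not the paper's release-aware algorithm; the task names, durations, and release times in the example are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def list_schedule(
    durations: Dict[str, float],
    preds: Dict[str, List[str]],
    release: Dict[str, float],
    num_cores: int,
) -> Dict[str, Tuple[float, float]]:
    """Greedy non-preemptive schedule; returns {task: (start, finish)}."""
    succs: Dict[str, List[str]] = {t: [] for t in durations}
    indeg = {t: 0 for t in durations}
    for task, parents in preds.items():
        for p in parents:
            succs[p].append(task)
            indeg[task] += 1
    # Earliest feasible start: release time, later raised by predecessor finishes.
    ready_time = {t: release.get(t, 0.0) for t in durations}
    ready = [(ready_time[t], t) for t in durations if indeg[t] == 0]
    heapq.heapify(ready)
    cores = [0.0] * num_cores  # min-heap of times at which each core becomes free
    schedule: Dict[str, Tuple[float, float]] = {}
    while ready:
        earliest, task = heapq.heappop(ready)
        core_free = heapq.heappop(cores)
        start = max(earliest, core_free)
        finish = start + durations[task]
        schedule[task] = (start, finish)
        heapq.heappush(cores, finish)
        for child in succs[task]:
            ready_time[child] = max(ready_time[child], finish)
            indeg[child] -= 1
            if indeg[child] == 0:
                heapq.heappush(ready, (ready_time[child], child))
    if len(schedule) != len(durations):
        raise ValueError("precedence graph contains a cycle")
    return schedule


def makespan(schedule: Dict[str, Tuple[float, float]]) -> float:
    return max(finish for _, finish in schedule.values())


# Hypothetical two-band pipeline: band A finishes sensing at t=2, band B at t=5.
durations = {"a1": 3.0, "a2": 2.0, "b1": 3.0, "fuse": 1.0}
preds = {"a2": ["a1"], "fuse": ["a2", "b1"]}
overlapped = list_schedule(durations, preds, {"a1": 2.0, "b1": 5.0}, num_cores=2)
# Decoupled baseline: every branch waits until all bands have been sensed.
decoupled = list_schedule(durations, preds, {"a1": 5.0, "b1": 5.0}, num_cores=2)
```

In this toy instance, releasing branch A as soon as its band is sensed shortens the makespan from 11 to 9 time units, which is the kind of cross-stage overlap the abstract refers to.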
Abstract:Deploying six-dimensional movable antenna (6DMA) systems in Internet-of-Vehicles (IoV) scenarios can greatly enhance spectral efficiency. However, the high mobility of vehicles causes rapid spatio-temporal channel variations, posing a significant challenge to real-time 6DMA optimization. In this work, we pioneer the application of 6DMA in IoV and propose a low-complexity, instantaneous channel state information (CSI)-free dynamic configuration method. By integrating vehicle motion prediction with offline directional response priors, the proposed approach optimizes antenna positions and orientations at each reconfiguration epoch to maximize the average sum rate over a future time window. Simulation results in a typical urban intersection scenario demonstrate that the proposed 6DMA scheme significantly outperforms conventional fixed antenna arrays and simplified 6DMA baseline schemes in terms of total sum rate.
Abstract:This demonstration presents U-Parking, a distributed Ultra-Wideband (UWB)-assisted autonomous parking system. By integrating Large Language Model (LLM)-assisted planning with robust fusion localization and trajectory tracking, it enables reliable automated parking in challenging indoor environments, as validated through real-vehicle demonstrations.
Abstract:Computational psychiatry faces a fundamental trade-off: traditional reinforcement learning (RL) models offer interpretability but lack behavioral realism, while large language model (LLM) agents generate realistic behaviors but lack structural interpretability. We introduce BioLLMAgent, a novel hybrid framework that combines validated cognitive models with the generative capabilities of LLMs. The framework comprises three core components: (i) an Internal RL Engine for experience-driven value learning; (ii) an External LLM Shell for high-level cognitive strategies and therapeutic interventions; and (iii) a Decision Fusion Mechanism for integrating components via weighted utility. Comprehensive experiments on the Iowa Gambling Task (IGT) across six clinical and healthy datasets demonstrate that BioLLMAgent accurately reproduces human behavioral patterns while maintaining excellent parameter identifiability (correlations $>0.67$). Furthermore, the framework successfully simulates cognitive behavioral therapy (CBT) principles and reveals, through multi-agent dynamics, that community-wide educational interventions may outperform individual treatments. Validated across reward-punishment learning and temporal discounting tasks, BioLLMAgent provides a structurally interpretable "computational sandbox" for testing mechanistic hypotheses and intervention strategies in psychiatric research.
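One way to picture fusion "via weighted utility" is a convex combination of RL action values and LLM-derived preference scores, followed by a softmax choice rule. The sketch below assumes exactly that form; the function names, the weight, and the example numbers are hypothetical and may differ from BioLLMAgent's actual mechanism.

```python
import math
from typing import List, Sequence


def fuse_utilities(
    rl_values: Sequence[float], llm_scores: Sequence[float], weight: float
) -> List[float]:
    """Combine per-action utilities: weight * RL value + (1 - weight) * LLM score."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(rl_values) != len(llm_scores):
        raise ValueError("both components must score the same actions")
    return [weight * q + (1.0 - weight) * s for q, s in zip(rl_values, llm_scores)]


def softmax(utilities: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax turning utilities into choice probabilities."""
    top = max(utilities)
    exps = [math.exp((u - top) / temperature) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical utilities for the four decks of the Iowa Gambling Task.
rl_values = [1.0, 0.0, -1.0, 0.5]   # learned from experienced rewards
llm_scores = [0.0, 1.0, 1.0, 0.5]   # strategy-level preferences
fused = fuse_utilities(rl_values, llm_scores, weight=0.5)
choice_probs = softmax(fused)
```

Setting `weight` to 1 recovers a pure RL agent and setting it to 0 a pure LLM-driven agent, so the single parameter stays interpretable.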
Abstract:Accurate trajectory prediction is vital for safe autonomous driving, yet existing approaches struggle to balance modeling power and computational efficiency. Attention-based architectures incur complexity that grows quadratically with the number of agents, while recurrent models struggle to capture long-range dependencies and fine-grained local dynamics. To address these limitations, we present FoSS, a dual-branch framework that unifies frequency-domain reasoning with linear-time sequence modeling. The frequency-domain branch performs a discrete Fourier transform to decompose trajectories into amplitude components encoding global intent and phase components capturing local variations, followed by a progressive helix reordering module that preserves spectral order; two selective state-space submodules, Coarse2Fine-SSM and SpecEvolve-SSM, refine spectral features with O(N) complexity. In parallel, a time-domain dynamic selective SSM reconstructs self-attention behavior in linear time to retain long-range temporal context. A cross-attention layer fuses temporal and spectral representations, while learnable queries generate multiple candidate trajectories, and a weighted fusion head expresses motion uncertainty. Experiments on the Argoverse 1 and Argoverse 2 benchmarks demonstrate that FoSS achieves state-of-the-art accuracy while reducing computation by 22.5% and parameters by over 40%. Comprehensive ablations confirm the necessity of each component.
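The amplitude/phase decomposition mentioned above can be made concrete with a small standard-library sketch. It applies a plain discrete Fourier transform to a single coordinate of a trajectory; the helper names and the sample values are hypothetical, and the FoSS modules that follow the transform are not modeled.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def dft(signal: Sequence[float]) -> List[complex]:
    """Naive O(N^2) discrete Fourier transform of a real sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def amplitude_phase(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Split the spectrum into amplitude and phase components."""
    spectrum = dft(signal)
    return [abs(c) for c in spectrum], [cmath.phase(c) for c in spectrum]


def reconstruct(amplitudes: Sequence[float], phases: Sequence[float]) -> List[float]:
    """Inverse DFT from amplitude and phase, returning the real part."""
    n = len(amplitudes)
    spectrum = [cmath.rect(a, p) for a, p in zip(amplitudes, phases)]
    return [
        (sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(spectrum)) / n).real
        for t in range(n)
    ]


# Hypothetical x-coordinates of an accelerating vehicle, one sample per time step.
xs = [0.0, 1.0, 2.5, 4.5, 7.0, 10.0]
amplitudes, phases = amplitude_phase(xs)
recovered = reconstruct(amplitudes, phases)
```

The zero-frequency amplitude equals the sum of the samples, and amplitude and phase together carry the full signal, so the decomposition itself loses no information.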
Abstract:Visual-Language Models (VLMs), with their strong capabilities in image and text understanding, offer a solid foundation for intelligent communications. However, their effectiveness is constrained by limited token granularity, overlong visual token sequences, and inadequate cross-modal alignment. To overcome these challenges, we propose TaiChi, a novel VLM framework designed for token communications. TaiChi adopts a dual-visual tokenizer architecture that processes both high- and low-resolution images to collaboratively capture pixel-level details and global conceptual features. A Bilateral Attention Network (BAN) is introduced to fuse multi-scale visual tokens, thereby enhancing visual understanding and producing compact visual tokens. In addition, a Kolmogorov-Arnold Network (KAN)-based modality projector with learnable activation functions is employed to achieve precise nonlinear alignment from visual features to the text semantic space, thus minimizing information loss. Finally, TaiChi is integrated into a multimodal and multitask token communication system equipped with a joint VLM-channel coding scheme. Experimental results validate the superior performance of TaiChi, as well as the feasibility and effectiveness of the TaiChi-driven token communication system.